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Abstract 

In  this  paper  we  put  forward  a  framework  that  integrates  features  of  reactive 
planning  models  with  modern  control-theory-based  approaches  to  motion  control 
of  robots.  We  introduce  a  motion  description  language,  MDLe,  that  provides  a 
formal  basis  for  robot  programming  using  behaviors,  and  at  the  same  time  permits 
incorporation  of  kinematic  and  dynamic  models  of  robots  given  in  the  form  of  dif¬ 
ferential  equations.  In  particular,  behaviors  for  robots  are  formalized  in  terms  of 
kinetic  state  machines,  a  motion  description  language,  and  the  interaction  of  the 
kinetic  state  machine  with  real-time  information  from  (limited  range)  sensors.  This 
formalization  allows  us  to  create  a  mathematical  basis  for  the  study  of  such  sys¬ 
tems,  including  techniques  for  integrating  sets  of  behaviors.  In  addition  we  suggest 
optimality  criteria  for  comparing  both  atomic  and  compound  behaviors  in  various 
environments.  We  demonstrate  the  use  of  MDLe  in  the  area  of  motion  planning  for 
nonholonomic  robots.  Such  models  impose  limitations  on  stabilizability  via  smooth 
feedback;  piecing  together  open  loop  and  closed  loop  trajectories  becomes  essential 
in  these  circumstances,  and  MDLe  enables  one  to  describe  such  piecing  together 
in  a  systematic  manner.  A  reactive  planner  using  the  formalism  of  the  paper  is 
described.  We  demonstrate  obstacle  avoidance  with  limited  range  sensors  as  a  test 
of  this  planner. 

This  research  was  supported  in  parts  by  grants  from  the  National  Science  Foundation’s  Engineer¬ 
ing  Research  Centers  Program:  NSFD  CDR  8803012  and  by  the  Army  Research  Office  under  Smart 
Structures  URI  Contract  No.  DAAL03-92-G0121,  ONR  (N00014-J-91-i451),  ARPA  (N00014-94-1090, 
DAST-95-C003,  F30602-93-C-0039),  ARL  (DAAH049610297)  and  the  NSF  Grant  NSF  EEC  94-02384). 
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1  Introduction 


The  field  of  robot  control  is  somewhat  split  between  two  communities  each  producing 
impressive  results.  On  the  control  side,  mathematical  approaches  have  been  extended  to 
design  control  laws  for  more  complex  entities  in  more  realistic  domains.  On  the  reactive 
planning  side,  physical  robots  have  been  designed  that  can  perform  more  complex  tasks 
in  shorter  amounts  of  time.  Unfortunately,  these  separate  strands  have  not  been  coming 
together  with  a  synthesis  of  the  best  ideas  of  both.  In  this  paper  we  provide  an  approach 
that  is  aimed  at  helping  to  bridge  this  gulf,  formalizing  behavioral  robotics  using  modern 
control-theory-based  approaches  to  robot  planning.  We  demonstrate  our  work  in  the 
area  of  motion  control,  a  critical  aspect  of  robotics. 

At  its  highest  level  of  abstraction  motion  control  can  be  viewed  as  the  generation  of 
symbolic  inputs  to  a  control  system  based  on  sensory  information  about  its  current 
state,  desired  state  and  the  state  of  the  environment  it  is  operating  in.  In  the  case  of 
mobile  robots,  these  symbolic  inputs  are  often  commands  such  as  “move”,  “turn”,  “stop” 
etc.  which,  along  with  sensor  information,  can  then  be  used  to  generate  more  complex 
behaviors  such  as  “avoid  obstacle”,  “trace  wall”  etc. 

While  in  many  cases  these  symbolic  inputs  can  be  mapped  into  appropriate  control  laws 
that  can  be  accepted  by  the  system,  often  with  more  complex  systems  this  requires  a 
deep  and  a  complete  understanding  of  the  underlying  dynamics.  In  fact,  in  many  cases 
the  nonlinear  dynamics,  kinematic  (nonholonomic)  constraints  and  limited  control  au¬ 
thority  make  the  generation  of  explicit  control  laws  for  precise  motion  control  (trajectory 
tracking,  point  to  point  locomotion)  exceedingly  difficult.  This  leaves  us  with  imprecise 
behaviors  which  need  to  be  altered  to  meet  the  desired  requirements. 

We  argue  that  motion  control  under  such  situations  is  reduced  to  generating  strings  of 
accepted  symbols  which  can  be  pieced  together  prior  to  initiation  of  motion  and  then 
altered  “on-the-fly”  based  on  real-time  input  from  sensors.  An  important  factor  for  a 
motion  control  strategy  of  this  nature  is  a  hybrid  architecture  that  serves  as  an  abstraction 
between  continuous  and  discrete  (symbolic)  control.  In  addition  it  is  important  that 
this  framework  integrate  real  time  sensor  information  into  primitive  behaviors  in  such 
a  way  as  to  incorporate  intelligent  switching  between  behaviors,  to  facilitate  planning 
and  learning.  Inputs  to  such  a  hybrid  control  system  are  symbol  strings  and  continuous 
and  discrete  inputs  from  sensors.  Outputs  are  continuous  signals  to  actuators.  The 
input  strings  can  be  thought  as  being  a  part  of  a  structured  language  [Brockett,  1990; 
Manikonda  et  al,  1995b],  which  is  rich  enough  to  encode  sensor  information  and  the 
model  (essentially  differential  equations),  and  at  the  same  time  provide  a  set  of  rules  for 
concatenation,  switching  etc. 

Earlier  work  that  discusses  some  of  these  aspects  of  motion  control  as  applied  to  robotics 
can  be  found  in  [Brooks,  1986;  Arbib,  1992;  Arkin,  1992],  Brooks  [Brooks,  1986]  uses 
task-achieving  behaviors  as  the  primary  level  of  task  decomposition.  He  introduces  the 
concept  of  a  subsumption  architecture  which  is  essentially  a  structured  and  layered  set 
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of  behaviors  with  increasing  levels  of  competence.  These  “reactive”  systems  typically 
exploit  domain  constraints,  using  clever  algorithms  to  allow  fast  processing  of  complex 
sensor  information  (cf.  [Horswill,  1993]).  Arbib  and  Arkin  (c.f.  [Arbib,  1992;  Arkin,  1992] 
and  references  therein)  have  applied  schema  theory  to  the  robotics  domain.  However  as 
discussed  in  [Arbib,  1992]  there  is  no  consensus  view  as  to  what  constitutes  schema 
theory. 

Although  these  approaches  have  significant  advantages  from  the  point  of  view  of  architec¬ 
tural  design  and  programming  flexibility,  they  have  resisted  mathematical  formalization1 
and  are  not  amenable  to  tests  for  optimality.  Comparing  two  sets  of  behaviors,  even 
within  the  same  task,  is  complex  and  the  domain-dependent  nature  of  the  solutions  can 
cause  these  systems  to  be  basically  incommensurate  -  one  may  fail  some  times,  one  may 
fail  at  other  times  and  comparison  is  difficult. 

On  the  other  hand,  control  theoretic  approaches  to  motion  control  have  traditionally 
required  detailed  mathematical  models  of  the  system,  its  environment  and  state,  to  de¬ 
sign  control  laws/algorithms  to  steer  the  system.  In  addition  mobile  robots  are  often 
approximated  as  points  or  disks  and  dynamic  models  assume  perfect  sensors  and  state 
information,  making  implementation  of  these  algorithms  in  the  real  world  difficult.  In 
practice,  however,  autonomous  systems  have  little  a  priori  information  about  their  envi¬ 
ronment,  have  limited  range  sensors  and,  in  addition,  dynamics  can  get  complicated  (see 
the  discussion  on  nonholonomic  robots  in  section  4)  making  the  design  of  explicit  control 
laws  to  steer  the  system  along  a  desired  trajectory  increasingly  difficult. 

The  inability  to  integrate  robot  model  dynamics  with  real  time  sensor  information  stems 
from  the  lack  of  a  powerful  enough  framework  to  integrate  the  two  approaches  (control 
theoretic  vs  behavior-based).  This  paper  is  a  step  in  the  direction  of  providing  such  a 
framework,  integrating  features  of  reactive  planning  with  modern  control-theory-based 
approaches  to  motion  planning.  First  we  introduce  a  motion  description  language,  MDLe, 
that  provides  a  formal  basis  for  robot  programming  using  behaviors,  and  at  the  same 
time  permits  incorporation  of  kinematic  models  of  robots  given  in  the  form  of  differential 
equations.  The  structure  of  the  language  MDLe  (based  on  Brockett’s  MDL[Brockett, 
1990])  allows  descriptions  of  triggers  (generated  by  sensors)  in  the  language.  Feedback 
and  feedforward  control  laws  are  selected  and  executed  by  the  triggering  events.  Secondly 
we  present  a  hierarchical  and  distributed  hybrid  architecture  for  generation  and  execution 
of  behaviors  and  planning  algorithms  developed  under  the  formalism  of  MDLe. 

While  MDLe  and  the  hybrid  architecture  provide  a  formalism  to  capture  and  express  be¬ 
havioral  and  control-theoretic  aspects  of  a  large  class  of  systems,  including  some  biological 
aspects,  we  find  that  MDLe  is  particularly  well  suited  to  the  demands  of  nonholonomic 
path  planning  with  limited  range  sensors.  As  an  example  of  the  strength  of  this  language, 
we  show  that  it  can  be  used  to  support  a  reactive  planner  for  nonholonomic  motion  plan¬ 
ning  in  the  presence  of  obstacles,  using  limited  range  sensors  for  obstacle  detection.  Some 
background  on  nonholonomic  constraints  and  a  discussion  on  earlier  approaches  to  path 

1  However  see  [Lyons  and  Arbib,  1989]  Robot  Schema  Language  (RS) 
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planning  with  nonholonomic  robots  are  also  presented. 


2  MDLe:  A  Language  for  Motion  Control 


We  treat  an  autonomous  robot  as  a  kinetic  state  machine  (following  Brockett  [Brockett, 
1990])  which  can  be  thought  of  as  a  continuous  analog  of  a  finite  automaton.  In  the 
framework  of  MDLe  these  kinetic  state  machines  are  governed  by  differential  equations 
of  the  form 


m 

X  =  f(x )  +  Y^gi{x)Ui] 
i=l 


y  —  h(x)  £  Rp 


(1) 


where 


x(-)  :  R+  =  [0,  oo)  ->•  R71 

Ui  :  R+  x  Rp  R 

(t,y(t))  M-  Ui(t,y(t)) 

Further  each  gi  is  a  vector  field  in  Rn. 

We  now  define  the  atoms  of  the  motion  language,  denoted  by  cr,  as  triples  of  the  form 
Oi  =  (U,  £a,  Ta)  where 

Ui  —  (wj,  •  •  •  um) 


where  each  Uj  is  as  defined  earlier 


£*:  Rfc  ->  {0,1} 

s(t)  ^  £a(s{t)) 


is  a  boolean  function,  Ta  £  R+  and  s(-)  :  [0,  T]  — >  R*  is  a  1  dimensional  signal  that 
represents  the  output  of  the  k  sensors.  £  can  be  interpreted  as  an  interrupt  or  trigger  to 
the  system  which  is  activated  in  a  case  of  emergency  or  change  in  the  environment,  e.g. 
the  robot  gets  too  close  to  an  obstacle. 

Let  us  denote  by  T°,  (measured  with  respect  to  the  initiation  of  the  atom)  the  time  at 
which  an  interrupt  was  received  i.e.  £a  changes  state  from  1  to  0.  2 

2The  definition  of  an  atom  here  can  be  compared  with  that  in  MDL  where  Brockett  treats  time-outs 
in  T ,  instead  of  giving  explicit  status  to  triggers. 
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If  the  kinetic  state  machine  receives  an  input  string  o\  •  •  •  an  =  (U\,  Tf)  •  •  •  (Un,  T“) 
then  the  state  x  will  evolve  according  to 

x  =  f(x)  +  G(x)Ui ,  t0  <t  <  t0  +  mm[T1a,T“], 

:  ~  (2) 

x  =  f(x)  +  G(x)Un,  t0  + - h 

<t  <  t0  H - h  ™[f“,T“] 

where  G  =  (pi(x),  •  •  •  £m(z))- 

Hence  each  atom  in  the  input  string  is  executed  in  sequential  order,  execution  of  a 
particular  atom  being  inhibited  either  via  interrupts  or  a  “time  out”  via  the  timer  T°. 

We  denote  a  kinetic  state  machine  as  a  six-tuple  ( U ,  X,  y,  S,  h,  £),  where 

U  =  Co0(R+  x  Rp;  Rm)  is  an  input  (control)  space, 

X  =  Rn  is  the  state  space, 

3^  =  Rp  is  an  output  space, 

S  C  Rfc  is  the  sensor  signal  space, 

h  :  X  — >  y  maps  the  state  space  to  the  output  space 

and 

£  :  S  — >  {0, 1}  is  an  interrupt. 

As  another  point  of  departure  from  MDL,  we  find  it  useful  to  bring  input  scaling  into 
the  picture.  This  provides  considerable  flexibility  as  will  be  seen  in  later  sections. 

Definition:  Given  an  atom,  (U,t;a,Ta),  define  (aU,  (3Ta),  a  —  (a1,  •  •  • ,  am)  £  Rm, 
/ 3  £  R+  as  the  corresponding  scaled  atom  and  denote  it  as  (a,  P)(U,f;a,Ta). 

Hence  a  scaling  is  used  to  scale  each  input  and  scaling  is  used  to  scale  the  time  for 
which  an  atom  is  to  be  executed. 

Definition:  An  alphabet  E  is  a  finite  set  of  independent  atoms,  i.e  ( U ,  £°,  Ta )  triples. 
Thus  E  =  {a1;  •  •  • ,  <rn}  for  some  finite  n  where  cq  denotes  the  triple  (Ui,  T“),  such 
that  Oi  7^  (a,  (3)(oj),  ol  £  Rm,  P  £  R+  and  i  =  1,  •  •  •  n,  j  =  1  •  •  •  n. 

Hence  an  alphabet  is  a  set  of  atoms  none  of  which  can  be  derived  from  other  atoms  in 
the  alphabet  via  scaling. 

To  simplify  notation  in  the  rest  of  the  discussion  we  denote  the  scaled  atom  (1,  l)<jj 
simply  by  crl. 

Definition:  An  extended  alphabet  Ee  is  the  infinite  set  of  scaled  atoms,  i.e.  triples 
(ctiUi,  £“ ,  (3iTta)  derived  from  the  alphabet  E. 

Definition:  A  language  E*  (respectively  E*)  is  defined  as  the  set  of  all  strings  over  the 
fixed  alphabet  E  (respectively  extended  alphabet  Ee). 
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Definition:  A  behavior,  denoted  by  w  is  an  element  (i.e.  word)  of  the  extended  language 
E*,  with  an  associated  timer  Tb  and  interrupt  For  example,  given  an  alphabet  E  = 
a  behavior  7r,  could  be  tt*  =  ((aH,Ph)aH  (ai2,  pi2)ah  (ai3, /3i3)ah),Tb,£b). 

The  notation  a ^  is  used  to  denote  the  jth  atom  in  the  ith  behavior.  Similarly  oii  ,  (5i] 
correspond  to  scaling  factors  of  the  jth  atom  in  the  ith  behavior. 

Often  we  will  have  to  work  with  atomic  behaviors  (behaviors  with  a  single  atom)  with 
£6  —  £a  and  Tb  =  Ta.  In  such  situation  to  simplify  notation  we  will  denote  atomic 
behaviors  simply  by  7 r*  =  ((o;n,  Ai)*7*),  dropping  explicit  reference  to  £6  and  Tb . 

Interrupts  associated  with  atoms  (£a)  are  called  level— 0  interrupts.  Interrupts  associated 
with  behaviors  (£6)  are  called  level  —  1  interrupts.  If  a  level-0  interrupt  is  received  while 
an  atom  of  a  behavior  is  being  executed,  the  execution  of  that  particular  atom  is  inhibited 
and  the  next  atom  in  that  behavior  is  executed.  If  a  level-1  interrupt  is  received  while 
a  behavior  is  being  executed,  the  execution  of  the  entire  behavior  is  inhibited  and  the 
next  behavior  (if  there  exists  one)  is  executed.  The  interrupts  £  may  be  results  of  a 
complicated  processing  of  sensory  information.  In  the  simplest  case  however  they  may 
involve  thresholding  of  sensory  information. 

Since  each  atom,  when  executed  by  a  kinetic  state  machine,  combines  in  general  both 
open  loop  and  feedback  controls,  one  could  argue  that  our  definition  of  behavior  captures 
some  aspects  of  the  essence  of  locomotion  behavior  [Bernstein,  1967],  as  well  as  the  sense 
in  which  the  term  is  used  in  [Brooks,  1986].  Further  the  passage  from  atoms  to  behaviors 
to  plans  suggests  (as  we  shall  see  in  section  3)  a  layered  architecture.  We  also  argue 
that  MDLe  captures  the  salient  features  of  various  architectures/approaches  to  model 
behaviors.  Comparing  it  with  the  schema  approach  of  [Arkin,  1988;  1992;  Arbib,  1993; 
1992],  one  observes  that  an  atom  incorporates  both  “motor”  (controls)  and  “perceptual” 
(interrupt  functions)  schemas  into  one  unit. 

The  introduction  of  the  timer  Ta(Tb)  servers  two  proposes,  (i)  It  serves  as  the  clock 
for  the  evolution  of  the  differential  equations  i.e.  if  an  open  loop  control  is  an  input  to 
the  kinetic  state  machine  then  the  timer  interrupt  can  be  used  to  turn  off  the  control  at 
the  desired  time  Ta.  (ii)  It  guarantees  that  no  behavior  will  be  executed  forever.  For 
example  if  the  desired  behavior  was  -  “move  towards  a  wall”  and  either  the  sensors  (used 
to  detect  the  wall)  were  defective,  or  the  wall  did  not  exist,  then  the  timer  guarantees 
that  the  atom  (behavior)  is  only  executed  for  a  maximum  period  of  time  Ta(Tb).  With 
the  introduction  of  scaling  factors  and  a  hierarchy  of  interrupts  we  provide  for  “directed 
control”  (the  term  as  used  in  [Blumberg  and  Galyean,  1995]  ),  optimization  and  learning. 
These  aspects  will  be  discussed  later. 

Before  proceeding  any  further  on  the  structure  of  the  language  we  discuss  an  example 
from  Rana  computatrix  (also  discussed  in  [Arbib,  1993;  1992])  to  model  the  visuomotor 
coordination  in  frogs  and  toads  with  the  purpose  of  pointing  out  a  biological  motiva¬ 
tion/application  of  MDLe.  Applications  to  autonomous  robots  are  discussed  in  detail  in 
later  sections. 


6 


Example  1:  It  has  been  observed  that  frogs  and  toads  approach  small  moving  objects 
(assuming  they  are  prey)  and  move  away  from  large  ones  (assuming  that  they  might 
be  predators).  It  was  hypothesized  that  the  tectum  (visual  region  in  the  animal’s  mid 
brain)  was  responsible  for  recognizing  small  objects  and  the  pretectum  processed  visual 
information  and  determined  which  objects  were  large.  If  one  assumes  a  model  in  which  the 
prey  seeking  behavior  is  activated  by  an  input  from  the  tectum  and  the  predator  avoiding 
behavior  is  triggered  by  a  signal  from  the  pretectum,  then  a  lesion  in  the  pretectum  should 
leave  the  frog  or  toad  unresponsive  towards  large  objects.  However  it  was  observed  that 
a  frog  with  a  lesioned  pretectum  approaches  both  large  and  small  objects  while  not 
exhibiting  an  avoidance  behavior. 

We  model  this  in  MDLe  as  follows  -  define  two  boolean  functions  Stectum,  Spretectum  '■  W  — > 
{0, 1}  that  process  visual  information  from  the  world  (environment)  W.  We  assume  that 
visual  information  is  passed  both  to  the  tectum  and  the  pretectum,  where  each  evaluates 
Stectum  and  Spretectum  respectively.  These  boolean  functions  are  defined  a  follows: 


'  tectum 


1  if  object  is  small 
0  if  object  is  large 


spretectum 


0  if  object  is  small 
1  if  object  is  large 


Further  lets  assume  that  we  have  modeled  the  motion  of  the  frog,  and  Uapproach  and 
Uretreat  are  controls  that  result  in  a  motion  towards  and  away  from  the  prey  and  predator 
respectively.  Let  us  define  two  atoms  as  follows: 

Ox  =  (Uapproach,  Cl,  Ti)  where  Cl  =  ( Stectum  V  Spretectum)  ~  “APPROACH  ATOM” 


02  =  (Uretreat,  Cl,  T%)  where  Cl  =  Spretectum  ~  “RETREAT  ATOM” 

Here  “V”  denotes  the  logical  OR  and  denotes  the  logical  NOT.  Define  a  behavior  7 r 
as  follows: 

■*  =  ((oio2y,cb,Tb) 

where  (ai^)*  defines  the  infinite  string  aia2aia2  •  •  •  0\(j2  •  •  ••  The  behavior  interrupt 
and  timer  are  chosen  to  interrupt  this  behavior  after  some  prescribed  time  Tb  or  via  Cb 
in  case  of  undesirable  changes  in  the  environment.  To  understand  the  working  of  the 
above  behavior  assume  that  the  moving  object  was  small.  Hence  o\  will  be  executed 
(as  £i  =  1  :  Stectum  =  1  and  Spretectum  =  1),  for  a  period  T®  after  which  execution  of  a2 
will  begin.  But  since  C2  is  0  (Spretectum  returns  a  0  for  small  objects),  a2  is  not  executed 
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and  cr i  is  executed  again.  This  process  will  continue  for  a  maximum  period  of  Tb  unless 
interrupted  by  £6.  Now  assume  that  the  moving  object  was  large.  Again  execution  of  <j\ 
will  fail  ( Stectum  =  0,  Spretectum  =  0  and  hence  £1  =  0)  and  only  cr2  will  be  repeatedly 
executed. 

We  observe  that  the  behavior  tx  models  the  response  of  a  frog  to  moving  objects.  It  also 
fits  with  the  observation  that  a  frog  with  a  lesioned  pretectum  will  move  towards  both 
small  and  large  objects:  Without  loss  of  generality  assume  that  a  lesion  in  the  pretectum 
results  in  Spretectum  =  0 ,  Vi.  Hence  we  observe  that  <72  is  always  inhibited  but  a\  will 
always  be  active  for  all  moving  objects.  One  observes  that  implicit  in  this  model  is  that 
the  “approach”  atom  processes  information  from  both  the  tectum  and  the  pretectum. 
(Compare  with  the  schema  model  suggested  by  Arbib).  □ 

On  a  separate  note  one  should  observe  that  the  sequential  execution  of  atoms  in  a  be¬ 
havior  does  not  preclude  the  fact  that  behaviors  cannot  be  executed  in  parallel.  It  just 
ensures  that  the  same  kinetic  state  machine  (essentially  the  differential  equations)  does 
not  receive  two  sets  of  conflicting  inputs  at  the  same  time.  A  large  complex  system  (which 
would  obviously  be  difficult  to  model)  could  be  modeled  as  several  kinetic  state  machines 
each  one  of  them  evolving  in  parallel  and  interacting  through  the  behavior  interrupts  £b. 

Definition:  The  length  of  a  behavior  denoted  by  |?r|  is  the  number  of  atoms  (or  scaled 
atoms)  in  the  behavior. 

Definition:  The  duration  T{n)  of  a  behavior 


TTj  —  (oii  1  j  (3ix )  (Uii  ,  ,  Tjj  )  (ch;  >  Pit )  (^ij  >  \  ) 


executed  beginning  at  time  t0  is  the  minimum  of  the  sum  of  the  time  intervals  for  which 
each  of  the  atoms  in  the  behavior  was  executed  and  the  prescribed  time  for  which  the 
behavior  was  expected  to  be  executed.  That  is, 

T(7Tj)  =  min[  ( min[T f,  -| - 1-  ,  Tb]  (3) 

Definition:  Given  a  kinetic  state  machine  and  a  world-model,  a  plan  F  is  defined  as 
an  ordered  sequence  of  behaviors  which,  when  executed,  achieves  the  given  goal.  For 
example  a  plan  T  =  {7r37ri7rn  •  •  •}  could  be  generated  from  a  given  language  where  each 
behavior  is  executed  in  the  order  in  which  they  appear  in  the  plan.  The  length  of  a  plan 
T  is  given  by  |T|  =  \^i\  and  the  duration  of  the  plan  is  given  by  T(T)  =  ^  Tfc).  In 

a  particular  context  there  may  be  more  than  one  plan  that  achieves  a  given  goal. 

Example  2.  Consider  the  problem  of  path  planning  for  a  robot  unicycle,  with  a  single 
sensor,  that  wanders  around  in  a  given  environment  without  colliding  into  obstacles 
(analogous  to  the  idea  of  the  first  level  of  competence  in  Brooks  [Brooks,  1986]  ).  Let  us 
assume  the  task  of  the  robot  (unicycle)  in  this  case  is  to  wander  till  it  senses  an  obstacle. 
If  it  senses  an  obstacle  it  avoids  the  obstacle  and  continues  to  wander  around.  We  now 
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formulate  this  problem  treating  the  unicycle  with  its  sensor  as  a  kinetic  state  machine, 
and  find  a  plan  that  solves  the  problem.  The  differential  equations  governing  the  kinetic 
state  machine  are 


X 

-  v\  cos  9 

(4-a) 

y 

=  v\  sin  9 

(4-b) 

9 

=  v2 

(4-c) 

where  ( x ,  y )  6  H2  denotes  the  position  of  the  unicycle  w.r.t  some  inertial  frame,  9  denotes 
the  orientation  of  the  unicycle  relative  to  the  horizontal  axis  and  v\  and  v2,  the  velocity 
of  the  unicycle  and  the  angular  velocity  respectively  are  the  inputs  to  the  kinetic  state 
machine.  With  reference  to  the  standard  notation  (1),  we  identify  ui  =  vi,  u2  =  v2, 
gi  =  (cos 0, sin#, 0)'  and  g2  =  (0,0,1)'. 

To  generate  the  “wander  behavior”  (wander  in  a  given  environment  without  colliding 
into  obstacles)  let  us  consider  the  following  atoms: 

°\  =  (Ui,€i,Tf)  where 

Ui  =  (i,oy 

(1  if  p  >  10 
\  0  if  p  <  10 

Tf  €  (0,  oo) 


and  where  p  is  the  distance  between  the  robot  and  the  obstacle  that  is  returned  by  the 
sensor. 


02  =  (£/2,£2,T2“)  where 


U2  = 

T“e 


(o,i  Y 

j  0  if  p  >  10 
\  1  if  p  <  10 
(0,oo),  and 


<73  =  (U3,t3,Tg)  where 


^3  =  (o,iy 

e3a  =  i 

r3a  €  (0,  oo) 


Let  a  —  (ad,  a2)  with  each  a1  €  [ki,k2],  ki,k2,E.  1R  and  (5  G  [0,  oo].  Now  consider  the 
following  atomic  behaviors  3 

7Ti  =  (ai,/?i)(I/i,fi,  1) 

3  See  remark  on  notation  for  atomic  behavior  immediatlely  following  the  definition  of  behavior 
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*2  =  knJhiW-t.Q,  1) 


<i  =  K«PiS,i) 

Based  on  the  equations  of  this  robot,  the  behavior  7Ti  is  interpreted  as  “move  forward” 
with  a  velocity  of  a\  units/sec  for  Pi  seconds,  and  behaviors  7r2  and  7 r3  can  be  interpreted 
as  “turn”  with  a  velocity  of  a\  deg/sec  for  maximum  of  p\  seconds  (here  i  =  2, 3)  unless 
interrupted.  As  explained  earlier  the  atoms  of  each  behavior  will  only  execute  as  long  as 
their  respective  £  functions  are  1  and  the  time  of  execution  is  less  than  T.  Since  £3  =  1 
in  the  entire  interval,  [0,/?3],  once  7r3  begins  executing  it  continues  until  t  =  /?3 


Figure  1:  Trajectory  and  Inputs  Generated  by  the  Plan  Pi 
Consider  the  plan 

with  =  (5,  0),  Pi  =  100,  a2  =  (0,  -1),  p2  —  90,  £p  =  1  and  Tp  very  large.  Observe 
that,  if  this  plan  is  executed  in  the  environment  (with  walls.  W1  and  W2)  as  shown  in 
Fig.  1,  the  robot  will  move  forward  for  time  t,  t0  <  t  <  t0  +  100,  where  t0  is  the  time 
at  which  the  behavior  was  started,  when  £x  will  interrupt4  it  (too  close  to  Wl).  Let  us 
assume  that  the  interrupt  was  received  at  to  +  Tj .  The  execution  of  behavior  7rx  is  then 
inhibited,  behavior  7 r3  is  picked  up  from  the  queue  and  is  executed.  As  £3  =  1  in  the 
entire  interval  t  €  [f0  +  Tx,  to  +  Tx  +  90)  the  robot  will  then  turn  clockwise  by  90  degrees 
and  then  it  will  again  execute  7rx  i.e.  move  forward.  But  again  after  some  finite  time 
wall  W2  (see  Fig.  1)  will  cause  £x  =  0  and  hence  interrupt  the  move  forward  behavior. 
Behavior  7 r3  is  executed  as  earlier  i.e.  the  robot  turns  clockwise  by  90  degrees,  and  now 
continues  to  move  forward.  If  it  does  not  detect  an  obstacle  at  the  end  of  100  seconds 
since  it  began  moving  forward,  it  will  stop,  turn  clockwise  90  degrees,  and  continue  to 
repeat  the  sequence  of  actions. 

Now  consider  the  plan  T2  =  {((ax,  A)7ri(Q!2,  ^2)^2)*,  £p,  Tp}  with  ax  =  (50,0),  pi  = 
2,  a2  =  (0,  —20)  and  p2  =  5.  Observe  that,  if  this  plan  is  executed  in  the  same  envi¬ 
ronment  (see  Fig.  2),  then  while  executing  “move  forward”  i.e.  7rx,  in  the  time  interval 

4We  drop  the  superscript  on  q  and  T 
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t0  <  t  <  t0  + 2  the  robot  realizes  that  the  obstacle  is  at  a  distance  less  than  10  units  from 
it  and  hence  £1  interrupts  “move  forward  ”  and  the  robot  begins  to  execute  “turn  right” . 
Due  to  the  choice  of  the  interrupt  function  £2  the  robot  will  now  switch  between  “turn 
right”  and  “move  forward”  (a  condition  referred  to  as  chattering)  and  trace  a  trajectory 
as  shown  in  the  figure.  Hence  depending  on  the  choice  of  the  alphabet  one  can  generate 
different  plans  to  achieve  the  same  task.  □ 


Figure  2:  Trajectory  and  Inputs  Generated  by  the  Plan  P2 

The  question  of  how  to  generate  a  plan  given  an  alphabet  and  a  kinetic  state  machine, 
is  an  open  one  and  it  largely  depends  on  the  task  and  the  planner.  In  section  4  we 
describe  a  path  planner  for  nonholonomic  robots  in  which  we  attempt  to  answer  related 
questions  regarding  existence  and  choice  of  alphabets.  Before  we  discuss  the  features  of 
the  planner  we  introduce  some  more  definitions  that  help  formalize  measures  to  evaluate 
the  performance  of  a  plan. 


2.1  Performance  Measure  of  a  Plan 

At  first  it  appears  that  generating  a  plan  to  steer  a  system  from  a  given  initial  state  x0  to 
a  final  state  xj  requires  complete  a  priori  information  of  the  world,  which  is  not  available 
in  many  instances  of  path  planning.  In  the  absence  of  such  complete  a  priori  information 
about  the  world  W,  the  planning  system  has  to  generate  a  sequence  of  plans  based  on  the 
limited  information  about  W  that  it  has.  Each  such  plan  will  only  achieve  an  intermediate 
goal.  Concatenation  of  these  plans  will  achieve  the  desired  goal.  In  MDLe  each  of  these 
plans  is  called  a  partialplan  and  is  denoted  by  T?  =  •  •  •  nln,  ,  Tf),  where  is  an 

interrupt,  which  when  set  to  0,  inhibits  the  execution  of  the  partial  plan  and  Tf  is  the 
prescribed  time  for  which  the  partial  plan  is  to  be  executed.  Here  the  notation  7rj  is  used 
to  denote  the  jth  behavior  in  the  ith  partial  plan.  Interrupts  associated  with  partial 
plans  are  referred  to  as  level-2  interrupts. 

Remark:  As  a  partial  plan  is  generated  with  limited  information  of  the  world,  not  all 
the  behaviors  and  not  every  atom  in  a  behavior  generated  by  the  partial  plan  may  be 
executed  at  run  time  for  the  following  reasons: 

(i)  Let  us  consider  a  behavior  7Tj  =  (cr3<Ti<74  •  •  •  crn,£f,Tf).  Let  us  assume  that  the  atom 
cr3  is  interrupted  by  £3  at  T3a.  Now  as  explained  earlier  ox  will  begin  to  execute.  But  if 
£3  =  ,  0i  will  not  be  executed  and  again  depending  on  £|,  <74  will  begin  to  execute. 
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(ii)  While  executing  an  atom  in  a  particular  behavior,  a  level-1  or  a  level-2  interrupt 
might  be  received,  and  hence  the  remainder  of  the  atoms  in  that  particular  behavior  will 
not  get  executed.  □ 

Given  an  algorithm  that  generates  a  plan  T  we  define  a  candidate  measure  of  performance 
0(r)  of  the  plan  as 

e(T)  =  r(r)  +  r|r|  (5) 

where  r  is  a  normalizing  factor  having  the  units  of  time.  (One  need  not  limit  oneself  to 
such  additive  combinations  although  this  is  the  only  case  used  here.) 

Defining  a  performance  measure  for  a  path  planner  is  a  difficult  task  as  it  is  largely 
dependent  on  the  goal  the  robot  seeks  to  achieve.  Some  path  planners  use  the  total 
time  to  achieve  the  goal  as  a  measure  of  performance.  In  many  situations  one  might 
be  interested  in  not  only  the  time  but  also  on  the  smoothness  of  the  path  traversed  or 
the  number  of  times  switching  between  different  controls  was  necessary.  For  example 
consider  the  task  of  parallel  parking  of  a  car.  One  might  be  able  to  achieve  the  goal 
by  using  only  open-loop  controls  but  switching  between  them  at  regular  intervals,  hence 
possibly  reducing  the  time  to  achieve  the  goal  but  compromising  on  the  smoothness  of  the 
path.  On  the  other  hand  if  one  uses  a  time  dependent  feedback  law,  the  same  task  could 
be  achieved,  possibly  by  moving  along  a  smooth  trajectory  at  the  risk  of  taking  longer 
time  to  achieve  the  goal.  This  indicates  a  trade-off  between  two  competing  requirements 
which  is  captured  by  the  performance  measure  (5). 

We  now  define  the  optimal  performance  of  a  plan  as 


0(r  )optimai  =  min{T(T)  +  r|r|}.  (6) 

Here  the  minimization  is  performed  over  the  subset  of  plans  generated  by  the  subset  B 
of  admissible  behaviors.  Depending  on  the  kinetic  state  machine  and  the  choice  of  the 
planner  one  can  now  place  bounds  on  the  optimal  performance  and  hence  compare  the 
performance  of  different  planners  given  the  same  language  or  that  of  the  planner  given  a 
new  language.  This  is  illustrated  in  the  example  given  below. 


Example  3:  Consider  the  problem  of  steering  the  unicycle  from  a  given  initial  location 
z0  to  Zf.  The  equations  of  the  unicycle  are  given  in  example  2.  Let  us  assume  that 
the  language  is  based  on  the  following  atoms,  cq  =  (C4,£“,  1),  a2  =  (C/2, ^2 , 1)  where 
UuQ,U2,$  are  as  defined  in  example  1.  Let  a  =  (a1,  a2)  with  each  a 1  €  [-5,4-5]  and 
P  6  (0,  00). 

Let  us  also  assume  that  the  planner  did  not  have  complete  information  about  the  world 
and  had  to  generate  n  partial  plans  to  achieve  the  goal.  Each  partial  plan  consists  of 
steering  the  unicycle  from  z*  to  Zj  (see  Fig  3)  such  that  there  are  no  obstacles  in  some 
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small  neighborhood  of  the  line  segment  joining  these  two  locations.  Let  us  make  a  further 
assumption  that  the  planner  uses  a 1  €  {1,-1}  as  the  scaling  factor  while  generating 
partial  plans. 


Figure  3:  Partial  Plan  Generation 


From  the  kinematic  equations  of  the  unicycle  we  know  that  a  simple  partial  plan  to  steer 
a  unicycle  from  Zi  =  (xi,yi,9i)  to  z3  =  ( Xj,yj,9j )  would  be  : 

(i)  turn  by  (9ZiZj  -  9{), 

(ii)  move  by  a  distance  di  and 
(hi)  finally  turn  by  (9,  -  9Z.Z.), 

where  is  the  vector  in  R2  joining  ( Xi,yi )  and  ( Xj,yj ),  di  =  ||z;z;-||2  and  9ZjZ.  is  the 
orientation  of  the  vector  w.r.t.  to  the  x-axis. 

We  can  rewrite  this  simple  algorithm  as  a  partial  plan  derived  from  the  language  using 
the  atomic  behaviors  7Ti  =  cri  and  tt2  =  cr2, 


where  9ix  and  9i2  are  the  angles  of  the  two  turns  as  described  above  of  the  ith  partial 
plan.  Hence  the  plan  to  steer  the  system  from  za  to  zj  is  given  by 

r  =  {rp1rv--rM- 

Given  a  plan  we  now  illustrate  how  bounds  can  be  placed  on  the  optimal  performance 
based  on  the  knowledge  of  the  kinetic  state  machine  and  the  language.  Let  dmax  = 
max  ||zjZj||. 

Tmax(TPi)  <  27T  +  dmax  +  27T  <  47T  +  dmax 
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and  \VPi\max  <  3-  Hence, 


0  <  0(r)  <  3n  +  n(47r  +  dmax). 

However,  as  we  are  using  only  open-loop  controls,  we  know  from  the  kinematics  of  the 
system  that  given  an  initial  state  x0  and  a  final  state  Xf  both  the  behaviors  (a*, 
and  (kcti,  (3i/k)cri  would  steer  the  kinetic  state  machine  from  the  initial  state  to  the  final 
state.  Hence  we  could  replace  by  (kai,fii/k)cri. 

Observe  that  in  the  generation  of  the  above  partial  plan  and  in  the  calculation  of  the 
performance  measure  we  restricted  a1  to  {—1,1}  (in  some  sense  placed  bounds  on  the 
speed  of  the  unicycle)  because  the  planner  did  not  have  complete  information  about  the 
world.  But  since  the  language  permits  a1  £  [—5,5],  we  have 

o  <  6(1%*™,  <  -(4,r  t  +  3n. 

o 

Having  placed  bounds  on  a  plan  generated  by  one  set  of  behaviors  we  can  now  compare 
the  performance  of  another  set  of  behaviors  (for  example,  one  using  periodic  functions 
to  steer  the  robot)  against  these  bounds.  □ 

In  the  above  examples  we  have  used  very  simple  controls  in  our  alphabet.  But  one  should 
note  that  depending  on  the  application,  a  wide  variety  of  controls  (open  loop  and  closed 
loop)  could  be  included  in  the  alphabet  and  some  examples  of  such  controls  can  be  found 
in  [Murray  and  Sastry,  1990;  Sussmann,  1991;  de  Wit  and  Sordalen,  1992;  Coron,  1992; 
Pomet,  1992;  Leonard  and  Krishnaprasad,  1993] 


3  Hybrid  Architecture 


As  we  seek  to  attain  higher  levels  of  autonomy  in  robots,  the  need  for  hierarchical  and 
distributed  control  schemes  becomes  apparent.  Motivated  in  part  by  the  hierarchical 
structure  of  neuromuscular  control  systems  [Bernstein,  1967]  we  present  a  control  archi¬ 
tecture  (see  Fig.  4),  to  generate  and  execute  plans  to  achieve  a  given  task.  The  lowest 
level  is  the  kinetic  state  machine  where  the  sensors  are  used  in  a  low-level  feedback 
loop.  Kinetic  state  machines  serve  as  abstractions  between  discrete  (atoms)  inputs  and 
continuous  time  control.  The  working  of  the  kinetic  state  machine  (see  Fig  5)  is  as  fol¬ 
lows  -  let  us  assume  that  an  atom  (Ui,  T“)  is  executed  at  time  t  =  t0.  T  is  a  timer 
whose  output  is  1  (active  high)  while  t0  <  t  <  Ti  and  is  0  (active  low)  if  t  >  t0  +  T*. 
£i(s(t))  returns  an  interrupt  (active  low)  to  the  system  when  conditions  defined  by  either 
£j“(s(^))>  £?(SW  or  £?($(£)  are  satisfied.  Observe  here  that  the  interrupt  could  be  of  level 
2,1,  or  O.The  functioning  of  the  AND  gates  in  the  kinetic  state  machine  can  be  inter¬ 
preted  as  follows  -  if  either  the  KSM  receives  an  interrupt  or  t  >  t0  +  Ti  (either  7)a,  Tf ,  Tf) 


14 


set) 


X(t) 


MEMORY 

+ 

CARTOGRAPHER 


O' 

O 

£ 


z> 

ex 

X 


O 

ex 

o 


PLANNER 
PARTIAL  PLAN  fP 


LEVEL  2 


INTERRUPT 


c 


-  ENCODER.  j 


INTERRUPT] 

*  *  * 

.  .  . 

i  T 

K  | 

It, 

7E  | 

i 

1  H 
1 

V  1 

J 

“  . Y  . . 


l 

<*> 

On 

On 

‘ 

<T| 

0| 

0| 

J 

J 

BEHAVIOR 

BANK 


EMERGENT 

BEHAVIORS 


V _ J 


LANGUAGE 

X* 


SCALING 

FACTORS 

(aP) 


ALPHABET 

2 


TT1 

SENSORS 


Figure  4:  Hybrid  Control  Architecture 


the  input  to  gate  II  is  an  active  low  and  hence  the  input  to  the  kinetic  state  machine  is 
inhibited  i.e.  the  current  atom/behavior/partial  plan  (depending  on  the  interrupt  level) 
is  stopped  and  the  next  atom/behavior/partial  plan  in  the  respective  queue  is  executed. 

The  planner  is  the  highest  end  of  the  architecture  where  sensory  information  is  processed 
to  generate  goal-related  trajectory  information.  It  uses  information  from  its  memory  and 
the  “behavior  bank”  to  generate  a  partial  plan  to  achieve  the  desired  goal  based  on 
its  current  information  about  the  world.  This  partial  plan  is  in  the  form  of  actions 
or  symbols.  The  motor  system  serves  as  an  abstraction  between  these  symbols  and 
behaviors  encoded  in  MDLe.  Once  the  behaviors  have  been  encoded,  atoms  in  a  behavior 
are  executed  as  explained  above.  In  case  of  sequential  dependence  of  a  set  of  kinetic  state 
machines  on  one  another,  the  motor  system  introduces  “dummy  atoms”  (atoms  with  zero 
control  input)  into  the  respective  behaviors.  The  interrupt  functions  of  these  behaviors 
are  activated  by  one  another. 
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Figure  5:  Kinetic  State  Machine 

External  control  or  “directed  control”  [Blumberg  and  Galyean,  1995]  is  permitted  at  two 
levels.  The  user  can  inhibit  the  input  from  the  planner  and  introduce  a  valid  behavior 
from  the  existing  behavior  bank,  or  inhibit  the  execution  of  a  current  behavior  via  the 
“control  interrupt  vector”  (CIV).  The  control  interrupt  vector  is  an  interrupt  function 
£civ  appended  to  every  behavior,  i.e..  $  =  (£i)desired  A  Z,bciv.  In  its  default  state  the  CIV 
for  the  particular  behavior  is  at  a  logical  one,  but  can  be  set  to  zero  by  the  user  when  it 
is  desired  to  inhibit  the  execution  of  a  particular  behavior. 

As  explained  earlier,  often  the  planner  generates  plans  based  on  local  sensor  information. 
This  information  may  be  incorrect  and  might  result  in  some  of  the  atoms  in  a  behavior 
not  being  executed  or  executed  for  a  time  less  that  the  estimated  one.  If  a  behavior 
successfully  completes  it  tasks  receiving  only  zero  level  interrupts,  then  this  behavior  in 
its  “cleaned  up”  version  (removing  unnecessary  atoms,  and  scaling  T“  appropriately)  is 
loaded  into  the  behavior  bank.  Alternately  a  partial  plan  could  successfully  execute  with 
interrupts  of  level  zero  or  level  1.  Then  such  a  successful  partial  plan  (in  its  cleaned  up 
state),  which  is  essential  a  concatenation  of  behaviors,  is  introduced  into  the  behavior 
bank.  Hence  we  see  that  existing  atoms  and  behaviors  can  give  rise  to  new  emergent 
behaviors. 

The  layered  and  distributed  nature  of  the  control  becomes  apparent  when  one  observes 
that  once  a  plan  has  been  generated,  each  level  and  even  various  modules  at  the  same 
level  continue  to  execute  independently. 
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4  Application  of  MDLe  to  path  planning  with  non- 
holonomic  robots 

The  problems  of  obstacle  avoidance  and  path  planning  with  autonomous  mobile  robots 
have  been  studied  in  various  settings.  [Lumelsky,  1987;  Lozano-Perez,  1980;  Khatib,  1986; 
Koditschek,  1987;  Shahidi  et  al. ,  1991].  These  approaches  either  assumed  that  the  planner 
had  to  have  substantive  a  priori  information  about  the  location,  shapes  and  sizes  of 
obstacles,  or  assumed  that  the  constraints  on  the  robot  (geometric  and  kinematic)  were 
holonomic  or  integrable.  In  practice,  however,  most  real  world  robotic  systems  have  little 
a  priori  information  about  the  shapes  and  size  of  the  obstacles  and  in  addition  include 
kinematic  constraints  that  are  nonholonomic.  A  few  examples  of  nonholonomic  systems 
are  a  front  wheel  drive  car,  dextrous  manipulation  or  assembly  with  robotic  hands  and 
attitude  control  of  a  satellite.  As  traditional  path  planners  assume  arbitrary  motion  they 
cannot  be  applied  to  nonholonomic  robots  as  they  result  in  nonfeasible  trajectories  i.e. 
trajectories  that  do  not  satisfy  the  constraints  on  the  configuration  variables. 

More  recently,  researchers  have  been  examining  nonholonomic  path  planning  in  the  pres¬ 
ence  of  obstacles  [Laumond,  1990;  Barraquand  and  Latombe,  1989;  Mirtich  and  Canny, 
1992;  Hu  and  Brady,  1995].  However,  while  most  of  these  planners  provide  some  excel¬ 
lent  results  they  are  quite  rigid  in  the  choice  of  control  laws  used  to  steer  the  robots  and 
often  do  not  exploit  the  control  laws  available  in  control  literature,  for  example  [Murray 
and  Sastry,  1990;  Sussmann,  1991;  Coron,  1992;  de  Wit  and  Sordalen,  1992].  They  also 
assume  near  complete  a  priori  information  about  the  world  and  only  account  for  small 
changes  in  the  environment. 

MDLe  is  particularly  well  suited  to  the  demands  of  nonholonomic  motion-planning  with 
limited  range  sensors.  As  nonholonomic  robot  models  impose  limitations  on  stabilizabil- 
ity  via  smooth  feedback  [Brockett,  1983],  the  ability  to  piece  together  open-loop  and 
closed-loop  trajectories  becomes  essential.  MDLe  enables  one  to  describe  such  piecing 
together  in  a  systematic  manner.  As  an  example  of  the  strength  of  this  language,  we 
show  that  it  can  be  used  to  support  a  reactive  planner  for  nonholonomic  motion  plan¬ 
ning  in  the  presence  of  obstacles,  using  limited  range  sensors  for  obstacles  detection.  In 
addition,  the  system  assumes  no  a  priori  information  on  the  location  and  shapes  of  the 
obstacles. 

In  the  following  section  we  reinterpret  existing  results  in  literature  on  nonholonomic 
robots,  and  answer  questions  related  to  the  existence  and  choice  of  an  alphabet  E  (re¬ 
spectively  Ee)  which  can  be  used  to  generate  behaviors  and  hence  plans  to  achieve  a 
required  goal.  We  also  describe  (section  5.2)  how  we  can  update  world  models  and  pro¬ 
vide  examples  of  the  system’s  performance.  For  further  details  on  nonholonomic  motion 
planning  we  refer  the  reader  to  [Fernandes  et  al.,  1993]  and  references  therein. 
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4.1  Nonholonomic  Constraints 


In  addition  to  being  subject  to  geometric  constraints  many  robotic  systems  are  subject 
to  velocity  constraints.  These  velocity  constraints  are  represented  by  relations  between 
generalized  coordinates  and  their  velocities,  and  are  written  in  matrix  form  as 

A(q)q  =  0  (7) 

where  q  G  !Rn  determines  the  generalized  coordinates,  q  are  the  generalized  velocities 
and  A(q)  G  HfcXn  represents  a  set  of  k  velocity  constraints.  We  also  assume  that  A(q) 
has  full  row  rank.  Since  a  kinematic  constraint  restricts  the  allowable  velocities  and  not 
necessarily  the  configuration,  it  is  not  always  possible  to  represent  it  as  an  algebraic 
constraint  on  the  configuration  space.  A  kinematic  constraint  is  said  to  be  integrable  if 
there  exists  a  vector  valued  function  h  :  Q  lRfc  such  that 

A{q)q  =  0  =>  ^9  =  0  (8) 

An  integrable  kinematic  constraint  is  hence  equivalent  to  a  holonomic  constraint. 

Kinematic  constraints  that  are  not  integrable  are  said  to  be  nonholonomic.  The  con¬ 
straint  (1)  defines  a  (2 n  —  k )  dimensional  smooth  manifold  M  =  {{q,q)\A(q)q  =  0}. 
These  kinematic  constraints  generate  a  set  of  constraint  forces  so  as  to  ensure  that  the 
system  does  not  move  in  the  direction  of  the  rows  of  the  constraint  matrix  (see  fig  6).  In 
mechanical  systems  such  constraints  are  often  those  of  rolling  without  slipping,  conser¬ 
vation  of  angular  momentum  etc. 

If  the  controls  u{t )  G  Mm  satisfy  n  —  k  <  m  <  n  then  the  kinematics  are  sufficient  to 
model  the  system  and  (7)  can  be  written  in  the  form  of  a  drift  free  control  system 


771 

X  =  J2bi(X)Ui’  (9) 

i~  1 

with  state  x(t )  and  control  u(t),  and  each  bi  is  a  vector  field.  Often  such  drift  free 
nonholonomic  systems  are  controllable  (cf.[Murray  et  ai,  1994]). 

Proposition  1  Given  an  obstacle-free  environment  and  a  kinetic  state  machine  that  is 
governed  by  the  differential  equation 

m 

x  =  y]  bj(x)uj\  x  G  JRn,u  G  lRm  (10) 

i—1 

such  that  the  control  Lie  algebra  ( i.e .  the  vector  space  spanned  at  any  point  by  all  the  Lie 
brackets  of  the  vector  fields  bi)  has  rank  n,  then  there  exists  an  alphabet  E  (respectively 
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Figure  6:  Nonfeasible  trajectories  due  to  nonholonomic  constraints 

£eJ  which  can  be  used  to  generate  behaviors  (and  hence  plans)  to  steer  the  system  from 
a  given  initial  state  x0  to  a  final  state  Xf. 

Proof:  From  Chow’s  [Hermann,  1968]  theorem  we  know  that  if  the  control  Lie  Algebra 
has  rank  n  then  the  system  is  controllable.  This  implies  there  exist  piecewise  constant 
controls  u  :  [0,T]  — >■  Rm,T  >  0  that  steer  the  system  from  any  initial  state  zo(0)  to  any 
final  state  x/(T). 

A  simple  alphabet  that  can  be  used  to  generate  behaviors  consists  of  m  triples  of  the 
form  (Ui,  1, 1),  •  •  • ,  (Um,  1, 1)  a  e  1R,  0  €  R+  where 

U\  =  (1,0, 0,0,  •••,0)' 

Ui  =  (0, 1,0,0,  •■•,0)' 

Um  =  (0,0,0,  *-*,0,1)'  □ 


While  writing  down  the  equations  of  motion  it  is  sufficient  to  consider  the  evolution  of 

the  state  x  €  C  (the  configuration  space  of  the  robot).  For  path  planning  and  obstacle 

avoidance  one  needs  to  be  concerned  with  the  “material  points  of  the  robot”,  location 

and  calibration  of  sensors.  In  our  planner  we  identify  the  material  points  of  the  robot 

with  a  closed  subset,  denoted  by  Br,  of  R3.  Hence  Br  :  C  — >  T  :  x  i-4  Br(x),  where  T 

is  the  space  of  all  closed  connected  subsets  of  R3.  Further  we  define  BTk  :  C  — >■  J1  such 

that  at  any  given  time  t  =  t'  (i)  Br(x(t'))  C  Brk(x(t'))  and  (ii)  d(dBrk,dBr)  =  k,  where 

d(X,y)  =  min  ||(x, y)||  and  dBr  denotes  the  boundary  of  the  set  JET.  Lets  assume 
xex,  yey 

that  the  robot  is  equipped  with  limited  range  sensors  that  can  detect  an  obstacle  in  a 
Brp  neighborhood,  of  x.  Define  an  interrupt  £nc  as  follows. 

0  Vx  e  Bre,  e  <  p 

1  Vx  €  Brp\B[ 

Proposition  2  :  Give  a  kinetic  state  machine  governed  by  the  differential  equation  (10) 
then  any  behavior  with  an  interrupt  of  the  form 

&  =  l?A?„c 
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where  £nc  is  as  defined  by  (11)  will  result  in  a  collision  free  path. 


Since  the  robot  is  equipped  with  limited  range  sensors  in  our  planner  the  task  of  path 
planning  is  reduced  to  steering  in  obstacle  free  neighborhoods  denoted  by  Br0jd.  Using  a 
potential  function  approach  (see  section  5.1  for  details)  a  point  Xf  G  dBT0fd  is  identified 
to  which  the  planner  steers  the  robot.  As  the  size  of  BT0jd  decreases  due  to  nonholonomic 
constraints  finding  a  control  s.t.  x(t)  G  Br0jd  cannot  always  be  guaranteed.  But  for  a 
certain  class  of  robots  that  are  locally  locally  controllable  (LLC)5  we  can  guarantee  this. 

Proposition  3  If  the  system  defend  by  (11)  is  LLC  then  there  exists  a  behavior  (-,f6,T6) 
such  that  x(t)  G  BT0jd ,  Vt  G  [0,T6]. 

Observe  that  by  a  choice  of  (b  as  defined  in  (11)  non-collision  is  guaranteed  even  if 
material  points  of  the  robot  leave  Br0jd. 

5  PNMR:  Path  Planner  for  Nonholonomic  Mobile 
robots 

The  task  of  the  planner  is  to  use  the  limited-range  sensor  information,  to  generate  partial 
plans  that  result  in  collision-free  feasible  trajectories.  Planning  is  done  at  two  levels  - 
global  and  local.  For  local  planning,  collision-free  (non)feasible  paths  are  generated 
using  potential  functions  assuming  that  the  robot  is  holonomic.  A  partial  plan  (feasible 
path)  is  then  generated  that  obeys  the  constraints  in  the  configuration  variables.  As 
feasible  trajectories  are  only  approximations  to  the  trajectories  generated  using  potential 
functions,  collision  with  obstacles  could  occur  while  tracing  them.  While  the  robot  is 
in  motion,  collisions  are  avoided  by  using  the  sensor  information  to  trigger  interrupts  as 
described  previously.  The  task  of  the  path  planning  is  outlined  as  follows  (see  Fig  7): 

1.  Interpret  local  sensor  information  to  generate  a  “control  point”  and  an  obstacle  free 
neighborhood  containing  this  “control  point”  to  which  the  robot  is  to  be  steered. 

2.  From  the  given  alphabet  select  atoms  (U,  £,  T)  that  could  be  used  to  steer  the  robot 
(in  general,  depending  on  the  richness  of  the  alphabet  (S),  there  could  be  more 
than  one  behavior  to  steer  the  robot  to  the  control  point). 

3.  Calculate  the  scaling  factor  a  (crucial,  as  it  determines  the  speed  of  the  robot). 
Having  calculated  a,  calculate  or  approximate  /?,  the  duration  for  which  each  atom 
is  to  be  executed. 

5The  system  (9)  is  said  to  be  locally  locally  controllable  at  x0  if  given  any  e  >  0  there  exists  a  S  >  0 
such  that  all  points  in  the  ^-neighborhood  of  x0  can  be  linked  by  a  trajectory  of  (9)  which  does  not  leave 
the  e-neighborhood 
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Figure  7:  Navigation  Task  Decomposition 


4.  Generate  an  optimal  partial  plan,  by  minimizing  the  performance  measure  (6).  The 
minimization  is  performed  over  the  admissible  behaviors. 

5.  Execute  the  partial  plan  and  update  runtime  information  regarding  actual  time  of 
execution  of  behaviors  in  the  partial  plan,  sensor  information  etc. 

6.  Given  an  updated  world  and  partial  plans  generate  an  optimal  plan. 


At  a  global  level  heuristics,  along  with  the  world  map  generated  while  the  robot  is 
en  route  to  the  goal,  are  used  to  solve  the  problem  of  cycles.  One  should  note  here  that 
the  planner  could  be  used  with  most  nonholonomic  robots,  by  selecting  the  corresponding 
alphabet  and  associating  rules  with  the  selection  of  atoms.  In  our  simulations  we  have 
assumed  that  the  robot  is  modeled  along  the  lines  of  a  unicycle.  See  [Manikonda  et  al, 
1995a]  for  details  on  the  implementation  of  the  planner. 

5.1  Planning  in  the  Obstacle  Free  Disk 

To  find  the  best  direction  of  travel  in  the  obstacle  free  disk  we  use  the  approach  of 
potential  functions.  As  in  the  earlier  work  on  path  planning  with  potential  functions,  the 
idea  behind  our  construction  is  based  on  electrostatic  field  theory  -  charges  of  the  same 
sign  repel  and  charges  of  the  opposite  sign  attract  in  accordance  with  Coulomb’s  law. 
Hence  we  assign  a  positive  charge  distribution  to  the  obstacles  and  the  mobile  robot  and 
a  negative  charge  distribution  to  the  goal.  The  idea  is  to  construct  a  vector  field  which 
will  give  the  best  direction  of  travel  based  on  the  location  of  the  obstacles  and  the  goal. 

The  robot  is  approximated  to  a  point  robot  and  as  sensors  can  detect  only  points  on  the 
boundaries  of  the  obstacles  that  lie  in  their  line  of  vision,  we  treat  obstacles  as  a  collection 
of  point  charges  and  assign  charges  to  them  depending  on  which  sensor  detects  them. 
The  intersection  of  the  resultant  gradient  field  with  the  circumference  of  the  obstacle  free 
disks  gives  the  desired  location  to  which  robot  is  to  be  steered.  (One  should  observe  here 
that  unlike  earlier  approaches  the  gradient  field  is  not  directly  used  to  steer  the  robot. 
As  integral  curves  of  the  resultant  gradient  field  may  not  result  in  feasible  trajectories 
we  use  the  resultant  gradient  field  only  to  determine  the  scaling  factors  and  Xf  on  the 
circumference  of  the  obstacle  free  disk.) 

Once  the  initial  and  desired  final  state  of  the  robot  is  known  control  inputs  are  chosen 
from  those  available  in  the  language  to  generate  feasible  trajectories  to  steer  the  robot 
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from  the  initial  location  to  the  final  desired  location.  If  more  than  one  such  control 
achieves  the  task  then  the  performance  measure  can  be  used  to  select  the  optimum  one. 

As  we  are  using  a  kinematic  model,  of  the  robot  an  underlying  assumption  is  that  the 
robot  is  moving  at  low  velocities  and  we  can  bring  the  robot  to  a  halt  simply  by  turning 
off  the  controls.  To  determine  scaling  factors,  which  are  directly  related  to  the  velocities 
of  the  inputs  to  the  equations  governing  the  motion  of  the  robot,  we  use  the  sum  total 
of  both  the  attractive  and  repulsive  forces  to  determine  the  bounds  on  the  velocities  and 
hence  the  bounds  on  the  scaling  factors.  A  simple  example  of  such  a  function  is  given  by 
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where  fa  and  fr  are  the  net  attractive  and  repulsive  forces  acting  on  the  robot.  Observe 
that  when  the  robot  is  close  to  either  the  obstacle,  the  goal  or  both,  it  moves  with  a 
lower  velocity  hence  making  the  kinematic  model  more  realistic. 

By  intelligently  choosing  weights  on  the  charges  (see  [Manikonda,  1994]  for  more  details) 
we  can  ensure  that  the  robot  either  avoids  the  obstacle  or  gets  close  enough  to  an  obstacle 
6  such  that  Brofd  <  Brcrt  in  which  case  it  traces  the  boundaries  of  obstacles  to  a  point 
where  it  finds  an  edge  or  is  heading  in  the  direction  of  the  goal. 

Remark:  It  is  important  to  mention  here  that  as  we  are  making  no  assumptions  on 
the  location  sizes  or  shapes  of  the  obstacles  guaranteeing  the  existence  of  a  path  is  very 
difficult,  though  empirical  results  have  shown  that  if  a  path  exists  the  robot  has  more 
often  than  not  found  it.  More  importance  here  is  stressed  on  the  ability  of  the  planner 
to  integrate  real  time  sensor  information  with  control-theory-based  approaches  to  steer 
nonholonomic  systems  in  a  systematic  way.  As  mentioned  earlier,  nonholonomic  robots 
impose  limitations  on  stabilizability  via  smooth  feedback  and  the  planner  developed 
under  the  framework  of  the  language  provides  an  elegant  way  of  piecing  together  various 
control  strategies. 


Tracing  Boundaries  Planning  in  Brcrt  is  a  closed  loop  planning  strategy  which  essen¬ 
tially  results  in  a  trace  behavior  that  traces  the  boundaries  of  the  obstacles.  Given  the 
limited  sensor  and  world  information  it  is  probable  that  the  direction  of  trace  may  have 
been  wrong.  Hence  we  use  a  heuristic  function  f(x)  —  D(Xrobot,  Xinit),  the  Euclidean 
distance  between  XTobot  and  Xinu  (the  point  at  which  the  trace  behavior  was  started) 
as  an  estimate  of  how  far  the  robot  has  strayed  from  the  goal.  The  robot  traces  the 
boundary  as  long  as  f  <  fs  where  fs  is  some  permitted  distance  from  where  the  trace 
behavior  was  started.  If  /  >  fs  then  we  retrace  path  and  trace  the  boundary  of  the 
obstacle  in  the  opposite  direction.  If  terminal  conditions  for  the  trace  are  not  met,  we 
set  fs  =  afs,  a  >  1,  and  repeat. 

6  In  the  implementation  of  the  planner  we  identify  a  critical  radius  in  which  the  robot  changes  its 
control  strategy 
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Remark:  Retracing  a  path  under  this  framework  is  a  rather  simple  task.  Observing  that 
the  system  is  a  drift  free  system,  retracing  involves  executing  the  past  n  partial  plans  in 
a  reverse  order  with  (—a)  scaling  factor. 

Fig.  8  shows  some  paths  generated  by  the  planner  for  a  robot  modeled  along  the  lines  of 
a  unicycle.7  It  is  important  to  note  that  while  the  plan  is  being  executed  the  sensors  are 
being  continuously  scanned  and  are  present  in  a  low  level  feedback  loop  hence  preventing 
any  collisions  with  obstacles. 


Obstacles 


(^J  Obstacle  free  disks 


Figure  8:  Paths  Generated  by  the  Planner 


7It  should  be  pointed  out  here  that  the  obstacle-free  disks  generated  by  the  planner  violate  the  exact 
definition  given  above,  but  this  is  because  in  the  simulator  we  have  used  only  sensors  in  the  “front”  of 
the  robot  to  generate  obstacle-free  disks.  For  now,  those  obstacles  that  are  not  detected  by  the  sensors 
are  treated  as  being  in  the  blind  spots  of  the  robot. 
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5.2  World  Model  Update 


Once  the  robot  has  explored  the  environment  using  limited  range  sensors,  it  is  natural  to 
expect  the  robot  to  generate  plans  of  a  better  performance  if  it  has  to  repeat  the  same 
task  or  move  to  goal  that  lie  in  the  explore  regions.  We  use  a  “learning  algorithm”  that 
improves  the  performance  of  a  plan  to  bring  it  closer  to  optimal. 

As  described  above,  the  plan  to  steer  a  robot  consists  of  a  sequence  of  partial  plans, 
where  each  partial  plan  steers  the  robot  in  some  obstacle  free  disk  of  radius  Brofd.  In 
the  rest  of  this  section  let  us  denote  each  of  the  obstacle  free  disks  which  were  used  to 
generate  the  ith  partial  plan  as  B{  and  the  ?th  partial  plan  as  Tf.  Further  let  us  assume 
that  n  such  partial  plans  were  generated.  Once  the  plan  has  been  generated  the  planner 
uses  following: 

(i) .  If  Bi  C  Bj ,  i  =  1,  •  •  -n,  j  =  1,  •  •  -n  (i.e.  B{  is  contained  in  Bj )  then  obviously  Bi 
contains  redundant  information.  Thus  if  Bi  C  Bi+ 1  the  partial  plan  Ff,  that  steers  the 
robot  from  Ci  to  C,-+i,  where  Ci  is  the  center  of  the  obstacle  free  disk,  and  partial  plan 
r?+1  that  steers  the  robot  form  Ci+\  to  Ci+2  can  be  replaced  by  a  partial  plan  rf  that 
steers  the  robot  from  Ci  to  Ci+2 ■  Since  Bi  C  Bi+ 1  it  is  obvious  that  0(f?)  <  0(Ffrf+1) 

(ii) .  Observe  that  since  Ci+ 1  lies  on  the  boundary  of  Cj,  we  are  guaranteed  the  existence  of 
a  trivial  nonfeasible  trajectory  (the  straight  line  joining  Ci  with  Ci+2)  that  lies  entirely  in 
Bi  U  Bi+i  i.e.  the  obstacle  free  area  enclosed  by  these  two  intersecting  obstacle  free  disks. 
Hence  if  there  exists  a  partial  plan  that  generates  a  feasible  trajectory  that  can  track 
this  nonfeasible  trajectory  and  lie  entirely  in  Bi\JBi+i  such  that  0(f?)  <  ©(rfrf+1) 
we  can  replace  Tfrf+1  by  f? 

(iii)  After  the  execution  of  (i)  and  (ii)  we  now  have  partial  plans  that  steer  the  robot 
from  Ci  to  Ci+j,  j  e  {2,  ■■■n}  such  that  the  trajectory  lies  entirely  in  \St3Bi.  The 
planner  now  explores  the  possibility  of  finding  (non) feasible  trajectories  from  Ci  to 
Ci+j+k,j  G  2 =  1, •••,n  such  that  these  trajectories  lie  entirely  in  U \+3+kBi 
and  the  performance  of  the  plan  that  generates  this  trajectory  is  better  than  the  earlier 
one. 

Fig.  9  shows  paths  generated  by  the  planner  after  it  has  gained  partial  knowledge  of  the 
world  it  has  explored  in  its  first  attempt  to  reach  the  goal.  The  bold  solid  lines  denote 
the  new  trajectories  (partial  plans)  generated  after  partial  knowledge  of  the  world  has 
been  gained.  It  clearly  shows  an  improvement  in  the  performance  of  the  planner  as  the 
length  of  the  plan  is  nearly  a  third  of  the  plan  generated  in  the  first  attempt. 

Remarks:  (i).  One  should  note  that  generating  plans  of  better  performance  does  not 
necessarily  imply  that  |ff|  <  |Tf|  where  ff  is  the  new  plan,  but  rather  could  simply 
imply  the  choice  of  better  scaling  factors  a,  [3  such  that  T(Tf)  <  T(rf). 

(ii).  One  need  not  restrict  the  generation  on  nonfeasible  trajectories  to  straight  line 
segments,  but  could  instead  use  arc  or  even  curves  that  best  fit  the  centers  of  these 
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obstacle  free  disks. 


6  Conclusions 


This  paper  is  an  attempt  to  bring  together  aspects  of  motion  control  as  discussed  by 
researchers  in  the  communities  of  Behavioral  Sciences,  Artificial  Intelligence  and  Control 
theory.  We  provide  a  language,  a  framework  and  a  hybrid  architecture  to  integrate  fea¬ 
tures  of  reactive  planning  methods  with  control-theoretic  approaches  to  motion  control. 
The  hybrid  language  permits  planning  using  a  set  of  behaviors  but  at  the  same  time  the 
incorporation  of  differential  equations  in  the  language  makes  it  possible  to  formalize,  com¬ 
pare  and  generate  behaviors  that  improve  over  time,  generate  maps,  etc.  It  is  clear  that 
in  a  task  such  as  motion-planning  of  systems  with  complex  dynamics,  under-actuated 
controls  and  limited  range  sensors,  it  is  helpful  to  be  able  to  switch  between  behaviors 
that  rely  on  the  direct  coupling  of  sensory  information  and  actuators  and  steering  us¬ 
ing  modern  control-theory-based  approaches.  Our  system  shows  that  these  two  can  be 
smoothly  integrated,  at  least  for  this  form  of  nonholonomic  robot  path  planning.  Future 
work  includes  extending  the  language  to  continue  formalization  of  behaviors,  including 
multiple  kinetic  state  machines  in  the  language  and  implementation  of  the  planner  to 
control  a  physical,  as  opposed  to  a  simulated,  robot. 
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